Evaluation of Automatic Summaries: Metrics under Varying Data Conditions
Authors
Abstract
In the evaluation of automatic summaries, multiple topics and human-produced model summaries are needed for the assessment to be stable and reliable. However, providing multiple topics and models is costly and time-consuming. This paper examines the relation between the number of available models and topics and the correlations with human judgment obtained by the automatic metrics ROUGE and Basic Elements (BE), as well as by the manual Pyramid method. Testing all of these methods on the same data set, taken from the TAC 2008 Summarization track, allows us to compare and contrast them under different conditions.
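For concreteness, the following is a minimal Python sketch of the kind of experiment the abstract describes, using a toy ROUGE-1-style recall score and entirely hypothetical data (this is not the paper's implementation; all names and scores below are illustrative): score each system against a varying number of model summaries, then correlate the metric with human judgments.

```python
from collections import Counter
from statistics import mean

def rouge1_recall(candidate: str, references: list[str]) -> float:
    """Average unigram recall of the candidate against the given references."""
    cand = Counter(candidate.lower().split())
    recalls = []
    for ref in references:
        ref_counts = Counter(ref.lower().split())
        overlap = sum(min(cand[w], n) for w, n in ref_counts.items())
        recalls.append(overlap / sum(ref_counts.values()))
    return mean(recalls)

def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation between metric scores and human judgments."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical toy data: three system summaries, a pool of model
# (reference) summaries, and made-up human quality scores per system.
systems = {
    "sysA": "the cat sat on the mat",
    "sysB": "a cat was on a mat",
    "sysC": "stock prices rose sharply",
}
models = ["the cat sat on a mat", "a cat was sitting on the mat"]
human = {"sysA": 4.5, "sysB": 3.5, "sysC": 1.0}

# Vary how many model summaries are available and observe how the
# metric-to-human correlation behaves.
for n_models in (1, 2):
    metric_scores = [rouge1_recall(s, models[:n_models]) for s in systems.values()]
    human_scores = [human[name] for name in systems]
    print(n_models, round(pearson(metric_scores, human_scores), 3))
```

In the actual study, such correlations would be computed per condition (number of topics by number of models) across all TAC 2008 systems; the sketch only shows the shape of that computation.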
Similar Resources
An Intelligent, Semantics-Oriented System for Evaluating Text Summarization Systems
Summarizers and machine translators have attracted much attention recently, and many efforts to build such tools are under way around the world. As with other languages, work in this area has been done for Farsi as well, so evaluating such tools is of great importance. Human evaluations of machine summarization are thorough but expensive: they can take months to f...
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines, as engines are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages are still in question. The aim of this study was to examine the validity and assess the quality of MTEMs from the Lexical Similarity set on machine tra...
ResQu: A Framework for Automatic Evaluation of Knowledge-Driven Automatic Summarization
JAYKUMAR, NISHITA. M.S., Department of Computer Science and Engineering, Wright State University, 2016. ResQu: A Framework for Automatic Evaluation of Knowledge-Driven Automatic Summarization. Automatic generation of summaries that capture the salient aspects of a search result set (i.e., automatic summarization) has become an important task in biomedical research. Automatic summarization offers...
QARLA: A Framework for the Evaluation of Text Summarization Systems
This paper presents a probabilistic framework, QARLA, for the evaluation of text summarisation systems. The input to the framework is a set of manual (reference) summaries, a set of baseline (automatic) summaries, and a set of similarity metrics between summaries. It provides (i) a measure to evaluate the quality of any set of similarity metrics, and (ii) a measure to evaluate the quality of a summary...
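As a rough illustration of the framework's summary-quality measure, here is a sketch of a QUEEN-style score, reconstructed from the QARLA paper's definition rather than taken from the authors' code, with a toy word-overlap similarity standing in for a real metric set: a summary scores well if, under every metric at once, it is at least as similar to the model summaries as the models are to each other.

```python
from itertools import product

def unigram_sim(s1: str, s2: str) -> float:
    """Toy similarity metric: Jaccard overlap of word sets (illustrative only)."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    return len(w1 & w2) / len(w1 | w2) if w1 | w2 else 0.0

def queen(auto_summary: str, models: list[str], metrics) -> float:
    """QUEEN-style score: the probability, over triples of model summaries,
    that the automatic summary is at least as close to a model as two
    models are to each other, simultaneously under all metrics."""
    triples = list(product(models, repeat=3))
    wins = sum(
        all(x(auto_summary, m) >= x(m1, m2) for x in metrics)
        for m, m1, m2 in triples
    )
    return wins / len(triples)

models = [
    "the cat sat on a mat",
    "a cat was sitting on the mat",
    "the cat is on the mat",
]
print(queen("the cat sat on the mat", models, [unigram_sim]))
```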
Feature Selection for Summarising: The Sunderland DUC 2004 Experience
In this paper we describe our participation in Task 1 (very short single-document summaries) of DUC 2004. The chosen task is related to our research project, which aims to produce abstract-style summaries that improve search engine result summaries. DUC allowed summaries no longer than 75 characters, so we focused on feature selection to produce a set of key words as summaries ins...
Journal:
Volume, issue:
Pages: -
Publication date: 2009